First, a few warnings. It is best to run MEMSPD without any TSRs or
memory managers loaded. If the 386 option is specified, MEMSPD will use
32-bit instructions, which will cause unpredictable effects if the CPU is
not a 386 or 486. The 386 option will also cause MEMSPD to attempt to go
into protected mode to test extended memory if it thinks there is any.
Most memory managers hide the extended memory from MEMSPD, but if they
don't, the attempt will fail with unknown results.
The purpose of MEMSPD is to give an indication of the relative speed of
different areas of memory in a PC, to compare memory speeds of different
computers, and to learn interesting facts about how memory and the CPU
cache work. It does this by timing a repeated string instruction (LODS)
that copies memory to a register. With a large repeat count (0x800), the
time spent reading the timer and setting up the instruction becomes
negligible, and the result is very close to the actual time to execute
each memory access.
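The effect of the large repeat count can be sketched as follows. This is written in Python for brevity (MEMSPD itself is assembler and C), and the 5 microsecond overhead is a made-up figure for illustration, not a measurement taken from MEMSPD:

```python
REPEAT_COUNT = 0x800  # repeat count used by MEMSPD (2048 accesses)

def per_access_us(total_us, repeat_count=REPEAT_COUNT):
    """Average time per access, ignoring the fixed setup overhead."""
    return total_us / repeat_count

def overhead_error_us(overhead_us, repeat_count=REPEAT_COUNT):
    """Error per access caused by a fixed overhead spread over all accesses."""
    return overhead_us / repeat_count

# Hypothetical numbers: a true access time of 0.24 us, plus 5 us of overhead
# for reading the timer and setting up the instruction.
true_access_us = 0.24
overhead_us = 5.0
measured_total_us = true_access_us * REPEAT_COUNT + overhead_us

print(round(per_access_us(measured_total_us), 4))   # close to 0.24
print(round(overhead_error_us(overhead_us), 4))     # the small error term
```

With these assumed numbers the overhead contributes only a few thousandths of a microsecond per access, well below the two decimal places MEMSPD prints.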
MEMSPD uses the 8-bit and 16-bit forms of the LODS instruction to access
memory (and the 32-bit form if -386 is specified). It also does two
accesses for each form - one in which it attempts to have the CPU cache
filled with data other than what it is reading (all cache misses), and one
in which it attempts to have the data already in the cache (all cache hits).
For each access, the current time is read from timer 0, 0x800 accesses are
made, the time is read again and the difference in microseconds is printed.
If the difference between the cache-miss and cache-hit times is less than
0.04 microseconds, it is assumed to be due to experimental error and only
a single time is printed.
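A minimal sketch of this measurement arithmetic, in Python for brevity. It assumes the standard PC timer input clock of 1.193182 MHz and Mode 3 operation, in which the 16-bit counter counts down by two per input clock. The function names are hypothetical, and which of the two values MEMSPD prints when they are collapsed into one is a guess:

```python
PIT_HZ = 1193182        # standard PC timer input clock (assumed)
REPEAT_COUNT = 0x800    # accesses per measurement, from the text
THRESHOLD_US = 0.04     # smaller differences are treated as noise

def elapsed_us(start_count, end_count):
    """Microseconds between two timer 0 readings.

    Assumes Mode 3 (the counter steps down by two per input clock) and
    that the counter has not wrapped more than once between readings.
    """
    counts = (start_count - end_count) % 0x10000  # the counter counts down
    clocks = counts / 2                           # Mode 3 steps by two
    return clocks * 1_000_000 / PIT_HZ

def per_access_us(start_count, end_count):
    """Average time of one access in a run of REPEAT_COUNT accesses."""
    return elapsed_us(start_count, end_count) / REPEAT_COUNT

def format_times(first_us, second_us):
    """One value, or a range when the two runs differ by 0.04 or more."""
    if abs(first_us - second_us) < THRESHOLD_US:
        return f"{first_us:.2f}"   # assumption: the first value is shown
    return f"{first_us:.2f}-{second_us:.2f}"

print(format_times(0.24, 0.33))
print(format_times(0.76, 0.77))
```

If Timer 0 is not in Mode 3, the divide-by-two step is wrong and every result is off by a factor of two, which is one reason a sanity check against known clock rates is worthwhile.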
For best results, MEMSPD should not be run under any sort of memory
manager, as they can hide extended memory or disturb the timings. When
the 386 option is specified, MEMSPD does not try to check to see if the
CPU actually is a 386 or greater - unpredictable results will occur if this
option is used with a 286 or below. MEMSPD assumes that Timer 0 is being
run in Mode 3 (the default for most BIOSes), but it is possible that a TSR
or BIOS might run it in a different mode. I would suggest doing a sanity
check, such as the one described below, to make sure your numbers make
sense. On very slow machines (e.g. 8 MHz or less) the time it takes to do
one repeated LODS may exceed the resolution of the timer, giving wildly
inaccurate numbers.
My Everex AGI 386-25 gives the following results:
 00000 -  9FFFF  byte 0.24-0.33, word 0.24-0.42, dwrd 0.24-0.59
 A0000 -  B7FFF  byte 0.76,      word 0.76,      dwrd 1.37
 B8000 -  BBFFF  byte 2.09,      word 1.78,      dwrd 3.75
 BC000 -  BFFFF  byte 2.05-1.92, word 2.17,      dwrd 3.75
 C0000 -  C7FFF  byte 0.76,      word 0.76,      dwrd 1.37
 C8000 -  DFFFF  byte 1.18,      word 2.19,      dwrd 4.24
 E0000 -  EFFFF  byte 0.76,      word 0.76,      dwrd 1.37
 F0000 -  FFFFF  byte 0.24-0.33, word 0.24-0.42, dwrd 0.24-0.59
100000 - 43FFFF  byte 0.24-0.33, word 0.24-0.42, dwrd 0.24-0.59
The following can be inferred from these numbers.
1. The memory from 0 - 9FFFF and F0000 and above is cached, since there are
two sets of numbers for each test. Also, all of this memory is 32-bit
memory, since the access times for byte, word and dword are all the
same.
2. The memory from A0000 - B7FFF, which is VGA video memory, is 16-bit
since the byte and word times are the same but the 32-bit time is
greater (the byte and word operations each take one memory access but
the dword takes two).
3. The memory from B8000 - BFFFF is the video memory that is currently
being used. The times for it vary wildly each time MEMSPD is run,
probably because the CPU accesses are often (and randomly) blocked
because the memory is being accessed by the video card to display
the screen.
4. The memory from C0000 - C7FFF is the video card ROM, which is 16-bit.
5. The memory from C8000 - DFFFF contains various other ROMs as well as
areas that are unused. The times here, since they include areas for
which there is no memory, represent the basic bus byte access times.
6. The memory from F0000 - FFFFF is the BIOS. The times shown are with
BIOS shadowing enabled, and are the same as the 32-bit memory. If
BIOS shadowing is turned off, the times become:
byte 0.76, word 0.76, dwrd 1.37
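The width reasoning used in the list above can be sketched as a small classifier, in Python for brevity. The function name and the 15% noise tolerance are assumptions for illustration, and reading the C8000 - DFFFF times as an 8-bit path is my interpretation of the "basic bus byte access" remark:

```python
def infer_width_bits(byte_us, word_us, dword_us, tolerance=0.15):
    """Guess the memory data width from byte/word/dword access times.

    If a wider access costs about the same as a narrower one, the memory
    can deliver it in a single cycle. The tolerance is an arbitrary
    allowance for measurement noise.
    """
    def about_equal(a, b):
        return abs(a - b) <= tolerance * max(a, b)

    if not about_equal(byte_us, word_us):
        return 8     # a word already needs more than one access
    if not about_equal(word_us, dword_us):
        return 16    # a dword needs two accesses
    return 32

# Cache-hit (or only) times taken from the results table.
results = {
    "00000-9FFFF": (0.24, 0.24, 0.24),
    "A0000-B7FFF": (0.76, 0.76, 1.37),
    "C8000-DFFFF": (1.18, 2.19, 4.24),
}
for region, times in results.items():
    print(region, infer_width_bits(*times), "bit")
```

The noisy B8000 - BFFFF region is left out on purpose, since its times change from run to run and would not classify reliably.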
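Running MEMSPD with the CPU cache disabled gives the second line below
(the first line repeats the cache-enabled result for comparison):
00000 - 9FFFF byte 0.24-0.33, word 0.24-0.42, dwrd 0.24-0.59 (cache enabled)
00000 - 9FFFF byte 0.42, word 0.42, dwrd 0.42 (cache disabled)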
This points out a very interesting attribute of this machine's CPU cache -
while cache hits are much faster than accesses without the cache, cache
misses for word accesses take the same time as uncached accesses, and cache
misses for 32-bit accesses are much more expensive.
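The comparison can be made explicit with a few lines of arithmetic, sketched here in Python. It assumes that the first number of each printed range is the cache-hit time and the second is the cache-miss time, and the helper name is hypothetical:

```python
# Microseconds per access for 00000 - 9FFFF, from the two result lines.
cache_hit = {"byte": 0.24, "word": 0.24, "dwrd": 0.24}
cache_miss = {"byte": 0.33, "word": 0.42, "dwrd": 0.59}
cache_off = {"byte": 0.42, "word": 0.42, "dwrd": 0.42}

def miss_penalty_us(size):
    """Extra cost of a cache miss compared with running with the cache off."""
    return round(cache_miss[size] - cache_off[size], 2)

for size in ("byte", "word", "dwrd"):
    print(size, miss_penalty_us(size))
```

A negative value means a miss is still cheaper than an uncached access, zero means they cost the same, and a positive value means the cache makes a miss slower than having no cache at all.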
In order to determine if these numbers are reasonably accurate, I did the
following computations:
1. A 25 MHz machine has a clock cycle time of .04 microseconds
(1/25,000,000). The Intel 386 book shows that LODS takes 6
clocks per repeated operation, and in fact 6 * .04 is the .24
that MEMSPD reports for LODS with no wait states (cache hit).
2. According to the EISA spec (which is the closest thing I have to any
sort of documentation for the ISA bus), it takes 7 bus clocks to
access an 8-bit ISA slave. An 8 MHz bus has .125 usec per cycle
(1/8,000,000), for a total of .875 usec. Adding the .24 usec it takes
to access memory via LODS gives 1.115, which is very close to the
value of 1.18 measured for the memory from C8000-DFFFF.
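The two computations above can be reproduced with a short Python sketch. The clock counts and clock rates come straight from the text; the variable and function names are only illustrative:

```python
def cycle_us(clock_mhz):
    """Length of one clock cycle in microseconds."""
    return 1.0 / clock_mhz

LODS_CLOCKS = 6        # clocks per repeated LODS on a 386, per the text
ISA_8BIT_CLOCKS = 7    # bus clocks per 8-bit ISA access, per the text

# 1. Cache hit on a 25 MHz CPU: 6 * 0.04 usec.
cache_hit_us = LODS_CLOCKS * cycle_us(25)

# 2. 8-bit ISA access on an 8 MHz bus, plus the LODS time: 0.875 + 0.24 usec.
isa_byte_us = ISA_8BIT_CLOCKS * cycle_us(8) + cache_hit_us

measured_us = 1.18     # byte time reported for C8000-DFFFF
error_pct = 100.0 * (measured_us - isa_byte_us) / measured_us

print(f"{cache_hit_us:.3f} {isa_byte_us:.3f} {error_pct:.1f}%")
```

The predicted 8-bit time comes out within about six percent of the measured one, which is the sense in which the numbers are "very close".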
The IOB and IOW options test the access times for I/O accesses for byte and
word accesses. If IOW specifies a port address of either 1F0 or 170, and if
MEMSPD thinks there is a disk controller at that address, it will start a
read of the first sector and wait for DRQ to be asserted before doing the
test. I have discovered that some WD1003 compatible controllers will do
16-bit I/O cycles when the data register is read only if there is data to
transfer, while others always do. Doing the test only when the controller
is transferring data ensures that 16-bit transfers will be done.
IOW with a port address of 170 or 1F0 should not be done if there is any
kind of disk cache or other TSR loaded that does disk I/O, or corruption
of disk data could occur. Likewise, IOB and IOW should not be given the
port address of an existing device if there is a TSR loaded that is using
the device.
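The DRQ wait mentioned above amounts to testing a bit in the controller's status register. The Python sketch below shows only the bit test, not the port I/O itself. It assumes the usual WD1003/AT layout, with the status register at base + 7, DRQ as bit 3 and BSY as bit 7. Whether MEMSPD also checks BSY is not stated, and the function names are hypothetical:

```python
# Status register bits of a WD1003-compatible controller.
STATUS_BSY = 0x80   # controller busy
STATUS_DRQ = 0x08   # data request: the data register has data to transfer

def status_port(base_port):
    """Status register address for a controller at base 0x1F0 or 0x170."""
    return base_port + 7

def ready_for_transfer(status):
    """True when the controller is not busy and DRQ is asserted."""
    return not (status & STATUS_BSY) and bool(status & STATUS_DRQ)

print(hex(status_port(0x1F0)))
print(ready_for_transfer(0x58))   # DRQ set, not busy
print(ready_for_transfer(0x80))   # still busy
```

A real test would poll the status port until this condition holds, then time the reads of the data register at the base address.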
I have included all of the source files. The assembler I used is a
non-standard one that we use at work. The C was compiled with a Metaware
compiler. MEMSPD was written very quickly; it is not pretty, and the
source is not documented very well.
Jack Jackson. 70152,3713
3/31/93